Skip to content

Conversation

@TheCelavi
Copy link
Member

No description provided.

Copilot AI review requested due to automatic review settings December 23, 2025 15:07
@TheCelavi TheCelavi merged commit 2eee870 into master Dec 23, 2025
3 checks passed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces a left join operator for dataset streams, mimicking SQL LEFT JOIN functionality. The implementation allows joining two iterables based on strict key equality, yielding tuples containing the source value and an iterable of matched values from the joined stream.

  • Implements LeftJoin operator and IndexedCollector for grouping values by non-unique keys
  • Adds leftJoin() method to Stream class with corresponding helper function
  • Makes map() method's valueTransform parameter nullable to support key-only transformations
  • Introduces UnsupportedException for unsupported operations
  • Adds stream() method to Buffer model for convenient stream creation

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/Operator/LeftJoin.php New operator implementing left join logic using IndexedCollector for right-side indexing
src/Collector/IndexedCollector.php New collector that indexes values by keys (supporting duplicates) for efficient lookup
src/Exception/UnsupportedException.php New exception class for unsupported operations
src/functions.php Adds left_join() helper function for creating LeftJoin operators
src/Stream.php Adds leftJoin() method and makes map() parameters nullable for named argument support
src/Model/Buffer.php Adds stream() method to create Stream instances from buffers
tests/Operator/LeftJoinTest.php Comprehensive tests for left join functionality including duplicate key handling
tests/Collector/IndexCollectorTest.php Tests for IndexedCollector covering iteration, array access, aggregation, and error cases
tests/StreamTest.php Integration test for Stream's leftJoin method
tests/Model/BufferTest.php Test for Buffer's new stream method
docs/source/components/dataset/operators/list/left_join.rst Documentation with examples for the left join operator
docs/source/components/dataset/operators/list/index.rst Adds left_join to operator index

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* @see Operator\Map
*/
public function map(callable $valueTransform, ?callable $keyTransform = null): self
public function map(?callable $valueTransform = null, ?callable $keyTransform = null): self
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API design issue: Making valueTransform nullable breaks the Map operator's constraint. The Map operator requires at least one transformation function (line 59-61 in Map.php throws LogicException if both are null). However, this change makes it syntactically possible to call map() with no arguments, which would result in a runtime exception. Consider adding validation or documentation to clarify that at least one parameter must be provided, or ensure the Map operator gracefully handles this case.

Copilot uses AI. Check for mistakes.
4 => 5,
], $data);
}

Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing PHPUnit Test attribute. This test method should be annotated with the #[Test] attribute to be recognized and executed by PHPUnit, consistent with other test methods in this file (e.g., line 180).

Suggested change
#[Test]

Copilot uses AI. Check for mistakes.
], $data);
}

public function left_join(): void
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent naming convention. The test method name uses snake_case (left_join) while other test methods in this file use camelCase (e.g., map on line 181, flatten visible in context). Consider renaming to leftJoin for consistency.

Suggested change
public function left_join(): void
public function leftJoin(): void

Copilot uses AI. Check for mistakes.
}

#[Test]
public function left_joins_with_duplicate_right_keys(): void
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent naming convention. The test method name uses snake_case (left_joins_with_duplicate_right_keys) while the other test method in this file uses camelCase (joins on line 15). Consider renaming to leftJoinsWithDuplicateRightKeys for consistency.

Suggested change
public function left_joins_with_duplicate_right_keys(): void
public function leftJoinsWithDuplicateRightKeys(): void

Copilot uses AI. Check for mistakes.
continue;
}

throw new UnsupportedException('Only object, string and integer keys are supported.');
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect PHPDoc comment. The doc comment states "Only object, string and integer keys are supported" but the code also handles float, bool, and null types as scalar values (line 84 checks \is_string($key) || \is_int($key) which would fail for other scalar types, but the condition structure suggests broader scalar support was intended). Consider either updating the error message to be accurate or adjusting the code to support all scalar types as documented.

Copilot uses AI. Check for mistakes.
return match (true) {
\is_string($offset) || \is_int($offset) => \array_key_exists($offset, $this->scalarIndex),
\is_object($offset) => $this->objectIndex->contains($offset),
default => throw new UnsupportedException('Only object, string and integer keys are supported.'),
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect PHPDoc comment. The doc comment states "Only object, string and integer keys are supported" but line 19 in the class documentation claims "Currently, the allowed key types are scalar values (int, float, string, bool, null) and objects." The implementation only supports int, string, and object keys (lines 84, 90, 118, 138). Either update the error message to match the implementation or extend the implementation to support all scalar types as documented.

Copilot uses AI. Check for mistakes.
return match (true) {
\is_string($offset) || \is_int($offset) => $this->scalarIndex[$offset],
\is_object($offset) => $this->objectIndex[$offset],
default => throw new UnsupportedException('Only object, string and integer keys are supported.'),
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect PHPDoc comment. The doc comment states "Only object, string and integer keys are supported" which is inconsistent with line 19 in the class documentation that claims support for all scalar types (int, float, string, bool, null) and objects. The implementation only supports int, string, and object keys. This error message should be consistent with the actual implementation and class-level documentation.

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +20
* Currently, the allowed key types are scalar values
* (int, float, string, bool, null) and objects.
Copy link

Copilot AI Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistency between documentation and implementation. The class-level documentation (line 19) states that "allowed key types are scalar values (int, float, string, bool, null) and objects", but the actual implementation only supports int, string, and object keys (as seen in lines 84-98, 117-121, and 137-141). Either update the documentation to accurately reflect the implementation, or extend the implementation to support all mentioned scalar types.

Suggested change
* Currently, the allowed key types are scalar values
* (int, float, string, bool, null) and objects.
* Currently, the allowed key types are integers, strings, and objects.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants